Jules - improved how we specify and find tesseract data. #4093

Merged 2 commits on Nov 27, 2024

julian-smith-artifex-com
Collaborator

No description provided.

docs/functions.rst
docs/installation.rst
    Updated Tesseract information.

src/__init__.py:
    Removed global TESSDATA_PREFIX as not required any more.
    Pixmap.pdfocr_save()
    pdfocr_tobytes()
        Use get_tessdata() to infer tessdata if unspecified.
    get_tessdata():
        Added optional `tessdata` arg; it is returned directly if set.
        Raise an exception if we cannot find Tesseract data (previously
        returned False).

src/utils.py:
    Removed global TESSDATA_PREFIX as not required any more.
    get_textpage_ocr() (and Page.get_textpage_ocr()):
        Use get_tessdata() to infer tessdata if unspecified.
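
The lookup behaviour described above (return an explicit `tessdata` value directly, otherwise infer a location, and raise rather than return False on failure) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `find_tessdata` and its `environ` parameter are hypothetical names, and falling back to the `TESSDATA_PREFIX` environment variable is an assumption about how the location might be inferred.

```python
import os


def find_tessdata(tessdata=None, environ=None):
    """Hypothetical stand-in for the lookup described above; not PyMuPDF's code."""
    # An explicitly supplied value is returned directly, without further checks.
    if tessdata:
        return tessdata
    # `environ` is injectable only to make the sketch easy to test.
    env = os.environ if environ is None else environ
    # Assumption: fall back to the TESSDATA_PREFIX environment variable.
    prefix = env.get("TESSDATA_PREFIX")
    if prefix:
        return prefix
    # Raise instead of returning False, so callers cannot silently ignore failure.
    raise RuntimeError("No tessdata specified and TESSDATA_PREFIX is not set.")
```

Raising an exception here means a caller that forgets to check the result fails at the lookup, with a message naming the cause, rather than later when a False value is used as a path.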
…inux.

We need to add PYMUPDF_SETUP_PY_LIMITED_API to CIBW_ENVIRONMENT_PASS_LINUX.
julian-smith-artifex-com merged commit 8003dec into main on Nov 27, 2024
2 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Nov 27, 2024